Speech spectrum representation and coding using multigrams with distance
Abstract
Multigrams allow us to split a string of symbols into a stream of variable-length sequences. Because the direct application of this method to vector-quantized speech spectra fails, we develop an extension of the method called modified multigrams, or multigrams with distance. The algorithm for modified multigram dictionary training as well as experimental results are presented. We found a significant improvement in the rate/distortion ratio in comparison to vector quantization with small codebooks. For precise spectrum representation, this method is less suitable, and we see its application rather in speech segmentation or in very low bit rate coding.
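As an illustration of the first sentence of the abstract, the following is a minimal sketch of classic multigram segmentation: a Viterbi search that splits a symbol string into dictionary sequences so that the product of their probabilities is maximal. It does not reproduce the multigrams-with-distance extension, whose details the abstract does not give. The function name `segment`, the dictionary `toy_probs`, and its probabilities are hypothetical values chosen for the example, not data from the paper.

```python
import math


def segment(symbols, probs, max_len):
    """Split `symbols` into dictionary sequences of length <= max_len so that
    the product of the sequence probabilities is maximal (Viterbi search)."""
    n = len(symbols)
    best = [-math.inf] * (n + 1)  # best[i]: best log-probability of symbols[:i]
    back = [0] * (n + 1)          # back[i]: start index of the last sequence ending at i
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            p = probs.get(tuple(symbols[start:end]))
            if p is None or best[start] == -math.inf:
                continue  # sequence not in the dictionary, or prefix unreachable
            score = best[start] + math.log(p)
            if score > best[end]:
                best[end] = score
                back[end] = start
    if best[n] == -math.inf:
        raise ValueError("the dictionary cannot cover this string")
    # Follow the back-pointers from the end of the string to recover the split.
    segments = []
    i = n
    while i > 0:
        segments.append(tuple(symbols[back[i]:i]))
        i = back[i]
    segments.reverse()
    return segments, best[n]


# Hypothetical toy dictionary (assumed probabilities that sum to 1).
toy_probs = {
    ("a",): 0.2,
    ("b",): 0.2,
    ("c",): 0.1,
    ("a", "b"): 0.3,
    ("b", "c"): 0.2,
}

segments, log_prob = segment(list("abcab"), toy_probs, max_len=2)
print(segments)  # [('a',), ('b', 'c'), ('a', 'b')]
```

In this toy case the split a | bc | ab has probability 0.2 × 0.2 × 0.3 = 0.012 and beats ab | c | ab (0.009). In the coding setting the symbols would be vector-quantization codebook indices rather than letters.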
Similar resources
A study of line spectrum pair frequencies for vowel recognition
The line spectrum pair (LSP) frequency representation has recently been proposed as an alternative linear prediction (LP) parametric representation. In the context of speech coding, this representation shows better quantization properties than the other LP parametric representations. In the present paper, the LSP representation is studied for speech recognition. Several distance measures based...
Segmental vocoder - going beyond the phonetic approach
In our paper, the problem of very low bit rate segmental speech coding is addressed. The basic units are found automatically in the training database using temporal decomposition, vector quantization and multigrams. They are modelled by HMMs. The coding is based on recognition and synthesis. In single-speaker tests, we obtained intelligible and natural-sounding speech at a mean rate of 211.2 b/...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
Multigrams for language identification
In our paper we present two new approaches for language identification. Both of them are based on the use of so-called multigrams, an information theoretic based observation representation. In the first approach we use multigram models for phonotactic modeling of phoneme or codebook sequences. The multigram model can be used to segment the new observation into larger units (e.g. something like ...
Fixed-length Segment Codin
This paper presents a method to attain very low bit-rate compression of the speech spectral envelope. It is based on fixed-length segment coding. The method utilizes the Temporal Decomposition (TD) technique for the compact representation of segments of Line Spectrum Frequency (LSF) vectors, followed by split matrix quantization. The TD technique is modified to fit fixed-length segment coding. Computat...